 safe harbor


In-House Evaluation Is Not Enough: Towards Robust Third-Party Flaw Disclosure for General-Purpose AI

arXiv.org Artificial Intelligence

The widespread deployment of general-purpose AI (GPAI) systems introduces significant new risks. Yet the infrastructure, practices, and norms for reporting flaws in GPAI systems remain seriously underdeveloped, lagging far behind more established fields like software security. Based on a collaboration between experts from the fields of software security, machine learning, law, social science, and policy, we identify key gaps in the evaluation and reporting of flaws in GPAI systems. We call for three interventions to advance system safety. First, we propose using standardized AI flaw reports and rules of engagement for researchers in order to ease the process of submitting, reproducing, and triaging flaws in GPAI systems. Second, we propose GPAI system providers adopt broadly-scoped flaw disclosure programs, borrowing from bug bounties, with legal safe harbors to protect researchers. Third, we advocate for the development of improved infrastructure to coordinate distribution of flaw reports across the many stakeholders who may be impacted. These interventions are increasingly urgent, as evidenced by the prevalence of jailbreaks and other flaws that can transfer across different providers' GPAI systems. By promoting robust reporting and coordination in the AI ecosystem, these proposals could significantly improve the safety, security, and accountability of GPAI systems.


A Safe Harbor for AI Evaluation and Red Teaming

arXiv.org Artificial Intelligence

Independent evaluation and red teaming are critical for identifying the risks posed by generative AI systems. However, the terms of service and enforcement strategies used by prominent AI companies to deter model misuse create disincentives for good-faith safety evaluations. This causes some researchers to fear that conducting such research or releasing their findings will result in account suspensions or legal reprisal. Although some companies offer researcher access programs, they are an inadequate substitute for independent research access, as they have limited community representation, receive inadequate funding, and lack independence from corporate incentives. We propose that major AI developers commit to providing a legal and technical safe harbor, indemnifying public interest safety research and protecting it from the threat of account suspensions or legal reprisal. These proposals emerged from our collective experience conducting safety, privacy, and trustworthiness research on generative AI systems, where norms and incentives could be better aligned with public interests, without exacerbating model misuse. We believe these commitments are a necessary step towards more inclusive and unimpeded community efforts to tackle the risks of generative AI.


Technology Helps Ensure There's No Safe Harbor for War Criminals

#artificialintelligence

In its effort to ensure there is no hiding place in the United States for war criminals, genocidaires and other human rights abusers, U.S. Immigration and Customs Enforcement has sought to harness the power of innovation, employing automated facial recognition technology and clever software algorithms to identify perpetrators who might be in, or be traveling to, America, officials told AFCEA's 2021 Federal Identity Forum and Expo Tuesday. War Crimes Hunter (WCH) is a series of customized reusable software tools built by the ICE Homeland Security Investigations (HSI) Innovation Lab in Crystal City, Virginia. It's used by HSI investigators in the Human Rights Violators and War Crimes Unit to try to identify suspected war criminals or other human rights violators. WCH automates the repetitive administrative work, while leaving key decisions to human analysts, explained Amy Nunes, a section chief in the unit. "We automate what we can automate as much as possible, still keeping an [human] analyst in the loop, because we didn't want to risk getting a bunch of [false positives or junk data] that we didn't need," she said.


The Winner Takes All: Recipe for Disaster - Netopia

#artificialintelligence

In the third decade of the commercial internet, concentration of power and money is greater than ever. Will this process stop or reverse? Or are we heading for a future of even stronger corporate dominance? Netopia talked to Jonathan Taplin, author of Move Fast and Break Things – a book which takes a closer look at the ideology and business of Silicon Valley's internet skyscrapers. Per Strömbäck: Is the "do first, ask later"-ideology the key to Silicon Valley's success?